Target Features Affect Visual Search, A Study of Eye Fixations
Visual search refers to the task of finding a target object among a set
of distracting objects in a visual display. In this paper, based on an
independent analysis of the COCO-Search18 dataset, we investigate how the
performance of human participants during visual search is affected by different
parameters such as the size and eccentricity of the target object. We also
study the correlation between the error rate of participants and search
performance. Our studies show that a bigger and more eccentric target is found
faster and with fewer fixations. Our code for the graphics is publicly
available at https://github.com/ManooshSamiei/COCOSearch18_Analysis.
Comment: 5 pages, 3 figures
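The kind of analysis described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the record layout (a `bbox` given as [x, y, w, h] and a `length` field holding the number of fixations), the display size, and the sample trials are all hypothetical, and eccentricity is approximated as the pixel distance from the image center to the center of the target box.

```python
import math

# Assumed display size in pixels (hypothetical, for illustration only).
IMAGE_W, IMAGE_H = 1680, 1050


def target_size(bbox):
    """Area of the target bounding box, with bbox given as [x, y, w, h]."""
    _, _, w, h = bbox
    return w * h


def target_eccentricity(bbox, image_w=IMAGE_W, image_h=IMAGE_H):
    """Pixel distance from the image center to the center of the bbox."""
    x, y, w, h = bbox
    cx, cy = x + w / 2, y + h / 2
    return math.hypot(cx - image_w / 2, cy - image_h / 2)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    var_x = sum((a - mx) ** 2 for a in xs)
    var_y = sum((b - my) ** 2 for b in ys)
    return cov / math.sqrt(var_x * var_y)


# Made-up trials; these are NOT values from COCO-Search18.
trials = [
    {"bbox": [800, 500, 40, 30], "length": 6},
    {"bbox": [200, 100, 120, 90], "length": 4},
    {"bbox": [1400, 800, 200, 150], "length": 3},
    {"bbox": [700, 400, 300, 250], "length": 2},
]

sizes = [target_size(t["bbox"]) for t in trials]
eccentricities = [target_eccentricity(t["bbox"]) for t in trials]
fixations = [t["length"] for t in trials]

print("size vs. fixations:", pearson(sizes, fixations))
print("eccentricity vs. fixations:", pearson(eccentricities, fixations))
```

A negative correlation between size and fixation count in real data would correspond to the reported finding that bigger targets are found with fewer fixations; the toy values above only demonstrate the computation.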
Predicting Visual Attention and Distraction During Visual Search Using Convolutional Neural Networks
Most studies in computational modeling of visual attention encompass
task-free observation of images. Free-viewing saliency considers limited
scenarios of daily life. Most visual activities are goal-oriented and demand a
great amount of top-down attention control. A visual search task demands more
top-down control of attention than free viewing. In this paper, we
present two approaches to model visual attention and distraction of observers
during visual search. Our first approach adapts a light-weight free-viewing
saliency model to predict eye fixation density maps of human observers over
pixels of search images, using a two-stream convolutional encoder-decoder
network, trained and evaluated on the COCO-Search18 dataset. This method predicts
which locations are more distracting when searching for a particular target.
Our network achieves good results on standard saliency metrics (AUC-Judd=0.95,
AUC-Borji=0.85, sAUC=0.84, NSS=4.64, KLD=0.93, CC=0.72, SIM=0.54, and IG=2.59).
Our second approach is object-based and predicts the distractor and target
objects during visual search. Distractors are all objects except the target
that observers fixate on during search. This method uses a Mask R-CNN
segmentation network pre-trained on MS-COCO and fine-tuned on the COCO-Search18
dataset. We release our segmentation annotations of targets and distractors in
COCO-Search18 for three target categories: bottle, bowl, and car. The average
scores over the three categories are: F1-score=0.64, MAP(iou:0.5)=0.57,
MAR(iou:0.5)=0.73. Our implementation code in TensorFlow is publicly available
at https://github.com/ManooshSamiei/Distraction-Visual-Search.
Comment: 33 pages, 24 figures, 12 tables; this is a pre-print manuscript
currently under review in Journal of Vision
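To make the object-level scores in this abstract more concrete, the sketch below shows how precision, recall, and F1 can be computed at an IoU threshold of 0.5. It is a simplified illustration, not the authors' evaluation code: it uses axis-aligned boxes rather than segmentation masks, a greedy one-to-one matching that is an assumption of this sketch, and made-up coordinates. It also computes plain precision and recall at a single threshold, not the mean average precision and recall (MAP, MAR) reported above.

```python
def box_iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision_recall_f1(predicted, ground_truth, iou_threshold=0.5):
    """Greedily match each prediction to its best unmatched ground-truth box."""
    unmatched = list(ground_truth)
    true_positives = 0
    for pred in predicted:
        best = max(unmatched, key=lambda gt: box_iou(pred, gt), default=None)
        if best is not None and box_iou(pred, best) >= iou_threshold:
            unmatched.remove(best)  # each ground-truth box is matched once
            true_positives += 1
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(ground_truth) if ground_truth else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total > 0 else 0.0
    return precision, recall, f1


# Made-up boxes for illustration only.
predicted = [(0, 0, 10, 10), (20, 20, 30, 30), (50, 50, 60, 60)]
ground_truth = [(0, 0, 10, 8), (21, 21, 31, 31)]

precision, recall, f1 = precision_recall_f1(predicted, ground_truth)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Replacing `box_iou` with a mask-based overlap would bring the sketch closer to an instance-segmentation setting, at the cost of needing pixel-level inputs.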